\section{Conclusions}
The goal of this article is to review what I learned
during the winter holiday and to pave the way for
further study. First, we formulated the framework
of neural networks and introduced the basic components of a network,
gradient descent optimization, and the method used to compute
the gradients: backward propagation. Then we summarized the dominant
problems in training a neural network, especially vanishing/exploding
gradients and saddle points, and presented a few tricks,
including LSUV initialization and Adam optimization, that
help improve training efficiency. Finally, we went through
the three main questions about the mechanism of neural networks
and obtained some straightforward intuition, such as the picture of a
landscape composed of flat basins.
\par Several directions remain for future work.
The TensorFlow and Keras frameworks can be used to
compare the performance of the various tricks and to empirically
test the results given in \autoref{sec:Question}. In addition,
owing to my limited mathematical background, most of the work on
generalization is currently beyond my comprehension. Further
study of basic convex optimization, together with a good command of
mathematical terminology and definitions, may help me gain more
insight into the essence of generalization in neural networks.
